Overview

Dataset statistics

Number of variables: 29
Number of observations: 1574040
Missing cells: 2185154
Missing cells (%): 4.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.8 GiB
Average record size in memory: 1.2 KiB
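
The two derived figures in this table can be checked from the raw counts. The sketch below recomputes the missing-cell percentage and the average record size. It assumes the percentage is taken over all cells (observations × variables) and that 1.8 GiB is a rounded figure, so agreement is only expected to one decimal place.

```python
# Raw counts copied from the Dataset statistics table.
n_variables = 29
n_observations = 1_574_040
missing_cells = 2_185_154
total_size_gib = 1.8  # rounded in the report

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size in KiB (1 GiB = 1024**2 KiB).
avg_record_kib = total_size_gib * 1024**2 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_kib:.1f} KiB")
```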

Variable types

Numeric: 12
DateTime: 2
Categorical: 15

Warnings

SECTOR has constant value "PRIVATE" (Constant)
SALARY_TYPE has constant value "1.0" (Constant)
COUNTRY_DESC has a high cardinality: 173 distinct values (High cardinality)
JOB_DESC has a high cardinality: 2285 distinct values (High cardinality)
ECONOMIC_ACT_DESC has a high cardinality: 967 distinct values (High cardinality)
COMPANY_NAME has a high cardinality: 122145 distinct values (High cardinality)
CIVIL_ID is highly correlated with Age (High correlation)
Age is highly correlated with CIVIL_ID (High correlation)
ONR_GVRN_CODE is highly correlated with GOVERNORATE_DESC and 1 other field (High correlation)
MARITAL_STATUS_DESC is highly correlated with RLGION_DESC and 2 other fields (High correlation)
GENDER_CODE is highly correlated with GENDER_DESC (High correlation)
Age is highly correlated with BIRTH_DATE and 2 other fields (High correlation)
COUNTRY_CODE is highly correlated with RLGION_CODE (High correlation)
BIRTH_DATE is highly correlated with Age and 2 other fields (High correlation)
Age Group is highly correlated with Age and 2 other fields (High correlation)
EDUCATION_DESC is highly correlated with RLGION_DESC and 2 other fields (High correlation)
GOVERNORATE_DESC is highly correlated with ONR_GVRN_CODE and 1 other field (High correlation)
CIVIL_ID is highly correlated with Age and 2 other fields (High correlation)
RLGION_DESC is highly correlated with MARITAL_STATUS_DESC and 3 other fields (High correlation)
RLGION_CODE is highly correlated with COUNTRY_CODE and 1 other field (High correlation)
EDUCATION_CODE is highly correlated with MAJOR_CODE (High correlation)
MAJOR_CODE is highly correlated with EDUCATION_DESC and 1 other field (High correlation)
جنسية is highly correlated with MARITAL_STATUS_DESC and 2 other fields (High correlation)
GENDER_DESC is highly correlated with GENDER_CODE (High correlation)
ONR_ID is highly correlated with ONR_GVRN_CODE and 1 other field (High correlation)
MARITAL_STATUS_CODE is highly correlated with MARITAL_STATUS_DESC (High correlation)
RLGION_CODE is highly correlated with SECTOR and 1 other field (High correlation)
MARITAL_STATUS_DESC is highly correlated with SECTOR (High correlation)
GENDER_CODE is highly correlated with GENDER_DESC and 1 other field (High correlation)
جنسية is highly correlated with SECTOR and 2 other fields (High correlation)
GENDER_DESC is highly correlated with GENDER_CODE and 1 other field (High correlation)
SECTOR is highly correlated with RLGION_CODE and 8 other fields (High correlation)
Age Group is highly correlated with SECTOR (High correlation)
EDUCATION_DESC is highly correlated with جنسية and 1 other field (High correlation)
GOVERNORATE_DESC is highly correlated with SECTOR (High correlation)
RLGION_DESC is highly correlated with RLGION_CODE and 2 other fields (High correlation)
RLGION_CODE has 71492 (4.5%) missing values (Missing)
EDUCATION_CODE has 88711 (5.6%) missing values (Missing)
MAJOR_CODE has 88711 (5.6%) missing values (Missing)
SALARY_TYPE has 1574039 (> 99.9%) missing values (Missing)
ADDRESS_AUTO_NO has 354729 (22.5%) missing values (Missing)
ECONOMIC_ACT_CODE is highly skewed (γ1 = 39.20844929) (Skewed)
EDUCATION_CODE is highly skewed (γ1 = 338.9212143) (Skewed)
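
Several of these warnings can be reproduced with simple per-column checks. The sketch below flags constant, high-cardinality and partly missing columns on toy data. The function name `flag_columns`, the threshold value and the toy table are illustrative assumptions, not the profiler's own implementation. It also shows why SALARY_TYPE is both "Constant" and "Missing": with 1574039 of 1574040 values missing, the single remaining value is trivially constant. A constant column such as SECTOR carries no variation, so the High correlation warnings that involve it should probably be read with caution.

```python
def flag_columns(columns, cardinality_threshold=50):
    """Return {column name: [flags]} for a dict of column name -> list of values.

    None marks a missing value. The threshold is an illustrative assumption.
    """
    flags = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        distinct = set(present)
        found = []
        if len(distinct) == 1:
            found.append("Constant")
        if len(distinct) > cardinality_threshold:
            found.append("High cardinality")
        if len(present) < len(values):
            share = 100 * (len(values) - len(present)) / len(values)
            found.append(f"Missing ({share:.1f}%)")
        flags[name] = found
    return flags


# Toy data (hypothetical, not taken from the profiled dataset).
toy = {
    "SECTOR": ["PRIVATE"] * 4,
    "SALARY_TYPE": [None, None, None, "1.0"],
    "COMPANY_NAME": ["a", "b", "c", "d"],
}
result = flag_columns(toy, cardinality_threshold=3)
print(result)
```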

Reproduction

Analysis started: 2021-05-26 17:25:21.859082
Analysis finished: 2021-05-26 17:53:20.723521
Duration: 27 minutes and 58.86 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
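
A report of this kind is typically produced by passing a DataFrame to `ProfileReport` and writing the result to HTML. The helper below is a minimal sketch: the name `build_profile`, the output path and the title are assumptions, and the settings used for this particular run (stored in config.json) are not reproduced here.

```python
def build_profile(df, output_path="report.html", title="Dataset profile"):
    """Profile a pandas DataFrame and write the HTML report to output_path."""
    # Imported lazily so that defining this helper does not require the package.
    from pandas_profiling import ProfileReport

    profile = ProfileReport(df, title=title)
    profile.to_file(output_path)
    return profile
```

With roughly 1.6 million rows and 29 columns the run above took about 28 minutes, so profiling a sample first may be worthwhile.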

Variables

CIVIL_ID
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1574038
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.80909332 × 10^11
Minimum: 1.780201 × 10^11
Maximum: 5.200400698 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 1.780201 × 10^11
5-th percentile: 2.620129012 × 10^11
Q1: 2.740819017 × 10^11
median: 2.821010167 × 10^11
Q3: 2.890606059 × 10^11
95-th percentile: 2.950425037 × 10^11
Maximum: 5.200400698 × 10^11
Range: 3.420199698 × 10^11
Interquartile range (IQR): 1.497870427 × 10^10

Descriptive statistics

Standard deviation: 1.039015273 × 10^10
Coefficient of variation (CV): 0.03698756698
Kurtosis: 0.3013487943
Mean: 2.80909332 × 10^11
Median Absolute Deviation (MAD): 7048912426
Skewness: -0.6759298858
Sum: 4.421625249 × 10^17
Variance: 1.079552738 × 10^20
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count    Frequency (%)
2.521113001 × 10^11      2        < 0.1%
2.890425046 × 10^11      2        < 0.1%
2.800222058 × 10^11      1        < 0.1%
2.740814062 × 10^11      1        < 0.1%
2.930101066 × 10^11      1        < 0.1%
2.890214048 × 10^11      1        < 0.1%
2.78042011 × 10^11       1        < 0.1%
2.820308003 × 10^11      1        < 0.1%
2.91010136 × 10^11       1        < 0.1%
2.83010158 × 10^11       1        < 0.1%
Other values (1574028)   1574028  > 99.9%

Value                    Count    Frequency (%)
1.780201 × 10^11         1        < 0.1%
1.891108 × 10^11         1        < 0.1%
1.981207 × 10^11         1        < 0.1%
2.080805 × 10^11         1        < 0.1%
2.150204001 × 10^11      1        < 0.1%
2.220301001 × 10^11      1        < 0.1%
2.220515002 × 10^11      1        < 0.1%
2.221222001 × 10^11      1        < 0.1%
2.230701001 × 10^11      1        < 0.1%
2.231208002 × 10^11      1        < 0.1%

Value                    Count    Frequency (%)
5.200400698 × 10^11      1        < 0.1%
3.140923029 × 10^11      1        < 0.1%
3.130306023 × 10^11      1        < 0.1%
3.031215015 × 10^11      1        < 0.1%
3.021114013 × 10^11      1        < 0.1%
3.021105012 × 10^11      1        < 0.1%
3.021102008 × 10^11      1        < 0.1%
3.020928016 × 10^11      1        < 0.1%
3.020916012 × 10^11      1        < 0.1%
3.020815011 × 10^11      1        < 0.1%

BIRTH_DATE
Date

HIGH CORRELATION

Distinct: 97
Distinct (%): < 0.1%
Missing: 500
Missing (%): < 0.1%
Memory size: 12.0 MiB
Minimum: 1878-01-01 00:00:00
Maximum: 2049-01-01 00:00:00
Histogram with fixed size bins (bins=50)

COUNTRY_CODE
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 173
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 459.0476214
Minimum: 0
Maximum: 883
Zeros: 9
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 106
Q1: 107
median: 702
Q3: 709
95-th percentile: 722
Maximum: 883
Range: 883
Interquartile range (IQR): 602

Descriptive statistics

Standard deviation: 297.7551975
Coefficient of variation (CV): 0.6486368378
Kurtosis: -1.878648097
Mean: 459.0476214
Median Absolute Deviation (MAD): 19
Skewness: -0.3207666481
Sum: 722559318
Variance: 88658.15766
Monotonicity: Not monotonic
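
Several of the COUNTRY_CODE statistics are simple functions of the others, which gives a quick consistency check on the tables. The sketch below recomputes range, IQR, CV, variance and sum from the reported figures. It assumes all 1574040 rows enter the mean (the column has no missing values), and the reported inputs are rounded, so only approximate agreement is expected.

```python
# Values copied from the COUNTRY_CODE tables.
n = 1_574_040          # observations; COUNTRY_CODE has no missing values
mean = 459.0476214
std = 297.7551975
q1, q3 = 107, 709
minimum, maximum = 0, 883

value_range = maximum - minimum   # reported: 883
iqr = q3 - q1                     # reported: 602
cv = std / mean                   # reported: 0.6486368378
variance = std ** 2               # reported: 88658.15766
total = mean * n                  # reported: 722559318

print(value_range, iqr)
print(cv, variance, total)
```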
Histogram with fixed size bins (bins=50)
Value               Count    Frequency (%)
709                 477043   30.3%
107                 447370   28.4%
702                 169610   10.8%
721                 72319    4.6%
101                 71308    4.5%
722                 67942    4.3%
110                 57882    3.7%
720                 47438    3.0%
106                 22864    1.5%
711                 20522    1.3%
Other values (163)  119742   7.6%

Value               Count    Frequency (%)
0                   9        < 0.1%
101                 71308    4.5%
103                 984      0.1%
104                 3218     0.2%
105                 21       < 0.1%
106                 22864    1.5%
107                 447370   28.4%
108                 4451     0.3%
110                 57882    3.7%
111                 18862    1.2%

Value               Count    Frequency (%)
883                 2        < 0.1%
882                 7        < 0.1%
881                 41       < 0.1%
880                 43       < 0.1%
870                 30       < 0.1%
860                 795      0.1%
850                 25       < 0.1%
839                 94       < 0.1%
838                 152      < 0.1%
837                 4        < 0.1%

COUNTRY_DESC
Categorical

HIGH CARDINALITY

Distinct173
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size143.8 MiB
الهنــد
477043 
مصـــر
447370 
بنجلاديش
169610 
باكستان
72319 
الكويت
71308 
Other values (168)
336390 

Length

Max length26
Median length7
Mean length6.891821046
Min length0

Characters and Unicode

Total characters10848002
Distinct characters34
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique15 ?
Unique (%)< 0.1%

Sample

1st rowالكويت
2nd rowباكستان
3rd rowمصـــر
4th rowمصـــر
5th rowمصـــر

Common Values

ValueCountFrequency (%)
الهنــد477043
30.3%
مصـــر447370
28.4%
بنجلاديش169610
 
10.8%
باكستان72319
 
4.6%
الكويت71308
 
4.5%
الفلبين67942
 
4.3%
ســوريا57882
 
3.7%
نيبال47438
 
3.0%
الأردن22864
 
1.5%
ايــران20522
 
1.3%
Other values (163)119742
 
7.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
الهنــد477043
29.3%
مصـــر447370
27.4%
بنجلاديش169610
 
10.4%
باكستان72319
 
4.4%
الكويت71308
 
4.4%
الفلبين67942
 
4.2%
ســوريا57882
 
3.6%
نيبال47438
 
2.9%
الأردن22864
 
1.4%
ايــران20522
 
1.3%
Other values (183)175834
 
10.8%

Most occurring characters

ValueCountFrequency (%)
ـ2518286
23.2%
ا1311399
12.1%
ل1052841
9.7%
ن1019183
9.4%
د694604
 
6.4%
ر593713
 
5.5%
ي555628
 
5.1%
ه486954
 
4.5%
م475269
 
4.4%
ص453742
 
4.2%
Other values (24)1686383
15.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter8273601
76.3%
Modifier Letter2518286
 
23.2%
Space Separator56114
 
0.5%
Dash Punctuation1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
ا1311399
15.9%
ل1052841
12.7%
ن1019183
12.3%
د694604
8.4%
ر593713
7.2%
ي555628
6.7%
ه486954
 
5.9%
م475269
 
5.7%
ص453742
 
5.5%
ب392287
 
4.7%
Other values (21)1237981
15.0%
Modifier Letter
ValueCountFrequency (%)
ـ2518286
100.0%
Space Separator
ValueCountFrequency (%)
56114
100.0%
Dash Punctuation
ValueCountFrequency (%)
-1
100.0%
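
The only Modifier Letter in COUNTRY_DESC is the tatweel "ـ" (U+0640), which accounts for 2518286 of the 10848002 characters (23.2%). Tatweel only stretches a word visually, so stretched and plain spellings of the same country name would count as different categories. The sketch below strips it before comparison. The helper name `strip_tatweel` is an assumption, and whether to normalise at all depends on whether downstream lookups expect the stored spelling.

```python
import unicodedata

TATWEEL = "\u0640"  # ARABIC TATWEEL (kashida), the "ـ" character in the tables


def strip_tatweel(text):
    """Remove tatweel padding so that stretched and plain spellings compare equal."""
    return text.replace(TATWEEL, "")


# A stretched spelling and the plain spelling of the same name ("الهند").
stretched = "\u0627\u0644\u0647\u0646\u0640\u0640\u062f"
plain = "\u0627\u0644\u0647\u0646\u062f"

print(unicodedata.category(TATWEEL))      # general category code of the tatweel
print(strip_tatweel(stretched) == plain)
```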

Most occurring scripts

ValueCountFrequency (%)
Arabic8273601
76.3%
Common2574401
 
23.7%

Most frequent character per script

Arabic
ValueCountFrequency (%)
ا1311399
15.9%
ل1052841
12.7%
ن1019183
12.3%
د694604
8.4%
ر593713
7.2%
ي555628
6.7%
ه486954
 
5.9%
م475269
 
5.7%
ص453742
 
5.5%
ب392287
 
4.7%
Other values (21)1237981
15.0%
Common
ValueCountFrequency (%)
ـ2518286
97.8%
56114
 
2.2%
-1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Arabic10791887
99.5%
ASCII56115
 
0.5%

Most frequent character per block

Arabic
ValueCountFrequency (%)
ـ2518286
23.3%
ا1311399
12.2%
ل1052841
9.8%
ن1019183
9.4%
د694604
 
6.4%
ر593713
 
5.5%
ي555628
 
5.1%
ه486954
 
4.5%
م475269
 
4.4%
ص453742
 
4.2%
Other values (22)1630268
15.1%
ASCII
ValueCountFrequency (%)
56114
> 99.9%
-1
 
< 0.1%

GENDER_CODE
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing78
Missing (%)< 0.1%
Memory size90.1 MiB
1.0
1406082 
2.0
167880 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters4721886
Distinct characters4
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.01406082
89.3%
2.0167880
 
10.7%
(Missing)78
 
< 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
1.01406082
89.3%
2.0167880
 
10.7%

Most occurring characters

ValueCountFrequency (%)
.1573962
33.3%
01573962
33.3%
11406082
29.8%
2167880
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number3147924
66.7%
Other Punctuation1573962
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01573962
50.0%
11406082
44.7%
2167880
 
5.3%
Other Punctuation
ValueCountFrequency (%)
.1573962
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common4721886
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.1573962
33.3%
01573962
33.3%
11406082
29.8%
2167880
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII4721886
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.1573962
33.3%
01573962
33.3%
11406082
29.8%
2167880
 
3.6%

GENDER_DESC
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size132.4 MiB
ذكر
1406082 
انثى
167880 
 
78

Length

Max length4
Median length3
Mean length3.106506823
Min length0

Characters and Unicode

Total characters4889766
Distinct characters7
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowانثى
2nd rowذكر
3rd rowذكر
4th rowذكر
5th rowذكر

Common Values

ValueCountFrequency (%)
ذكر1406082
89.3%
انثى167880
 
10.7%
78
 
< 0.1%
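
GENDER_DESC reports 3 distinct values, 0 missing and a minimum length of 0 because 78 rows hold an empty string, while GENDER_CODE reports the same 78 rows as missing. Mapping blanks to missing before profiling would make the two columns agree. The sketch below shows the idea on a toy list; the helper name `blank_to_missing` and the sample rows are assumptions.

```python
from collections import Counter


def blank_to_missing(values):
    """Replace empty or whitespace-only strings with None."""
    return [None if v is None or not v.strip() else v for v in values]


# Toy column (hypothetical): two labelled rows and one blank row.
raw = ["\u0630\u0643\u0631", "", "\u0627\u0646\u062b\u0649"]
cleaned = blank_to_missing(raw)

n_missing = sum(v is None for v in cleaned)
n_distinct = len(Counter(v for v in cleaned if v is not None))
print(n_missing, n_distinct)
```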

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
ذكر1406082
89.3%
انثى167880
 
10.7%

Most occurring characters

ValueCountFrequency (%)
ذ1406082
28.8%
ك1406082
28.8%
ر1406082
28.8%
ا167880
 
3.4%
ن167880
 
3.4%
ث167880
 
3.4%
ى167880
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter4889766
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
ذ1406082
28.8%
ك1406082
28.8%
ر1406082
28.8%
ا167880
 
3.4%
ن167880
 
3.4%
ث167880
 
3.4%
ى167880
 
3.4%

Most occurring scripts

ValueCountFrequency (%)
Arabic4889766
100.0%

Most frequent character per script

Arabic
ValueCountFrequency (%)
ذ1406082
28.8%
ك1406082
28.8%
ر1406082
28.8%
ا167880
 
3.4%
ن167880
 
3.4%
ث167880
 
3.4%
ى167880
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
Arabic4889766
100.0%

Most frequent character per block

Arabic
ValueCountFrequency (%)
ذ1406082
28.8%
ك1406082
28.8%
ر1406082
28.8%
ا167880
 
3.4%
ن167880
 
3.4%
ث167880
 
3.4%
ى167880
 
3.4%

RLGION_CODE
Categorical

HIGH CORRELATION
MISSING

Distinct5
Distinct (%)< 0.1%
Missing71492
Missing (%)4.5%
Memory size88.7 MiB
1.0
1005174 
2.0
235153 
0.0
218952 
3.0
 
41858
4.0
 
1411

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters4507644
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.01005174
63.9%
2.0235153
 
14.9%
0.0218952
 
13.9%
3.041858
 
2.7%
4.01411
 
0.1%
(Missing)71492
 
4.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
1.01005174
66.9%
2.0235153
 
15.7%
0.0218952
 
14.6%
3.041858
 
2.8%
4.01411
 
0.1%
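
The percentages in the pie chart table differ from those under Common Values (66.9% against 63.9% for code 1.0). The difference appears to come from the denominator: the pie chart seems to exclude the 71492 missing rows. The short check below recomputes both shares from the reported counts under that assumption.

```python
# Counts copied from the RLGION_CODE tables.
n_rows = 1_574_040
n_missing = 71_492
count_code_1 = 1_005_174

share_of_all_rows = 100 * count_code_1 / n_rows
share_of_non_missing = 100 * count_code_1 / (n_rows - n_missing)

print(f"{share_of_all_rows:.1f}% of all rows")
print(f"{share_of_non_missing:.1f}% of non-missing rows")
```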

Most occurring characters

ValueCountFrequency (%)
01721500
38.2%
.1502548
33.3%
11005174
22.3%
2235153
 
5.2%
341858
 
0.9%
41411
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number3005096
66.7%
Other Punctuation1502548
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01721500
57.3%
11005174
33.4%
2235153
 
7.8%
341858
 
1.4%
41411
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
.1502548
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common4507644
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
01721500
38.2%
.1502548
33.3%
11005174
22.3%
2235153
 
5.2%
341858
 
0.9%
41411
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII4507644
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
01721500
38.2%
.1502548
33.3%
11005174
22.3%
2235153
 
5.2%
341858
 
0.9%
41411
 
< 0.1%

RLGION_DESC
Categorical

HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size136.5 MiB
مسلم
1005174 
مسيحي
235153 
ديانات أخري
218952 
 
71492
هندوسي
 
41858

Length

Max length11
Median length4
Mean length4.994615766
Min length0

Characters and Unicode

Total characters7861725
Distinct characters17
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row
2nd rowمسلم
3rd rowمسلم
4th rowمسلم
5th rowمسلم

Common Values

ValueCountFrequency (%)
مسلم1005174
63.9%
مسيحي235153
 
14.9%
ديانات أخري218952
 
13.9%
71492
 
4.5%
هندوسي41858
 
2.7%
بوذي1411
 
0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
مسلم1005174
58.4%
مسيحي235153
 
13.7%
ديانات218952
 
12.7%
أخري218952
 
12.7%
هندوسي41858
 
2.4%
بوذي1411
 
0.1%

Most occurring characters

ValueCountFrequency (%)
م2245501
28.6%
س1282185
16.3%
ل1005174
12.8%
ي951479
12.1%
ا437904
 
5.6%
د260810
 
3.3%
ن260810
 
3.3%
ح235153
 
3.0%
ت218952
 
2.8%
218952
 
2.8%
Other values (7)744805
 
9.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter7642773
97.2%
Space Separator218952
 
2.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
م2245501
29.4%
س1282185
16.8%
ل1005174
13.2%
ي951479
12.4%
ا437904
 
5.7%
د260810
 
3.4%
ن260810
 
3.4%
ح235153
 
3.1%
ت218952
 
2.9%
أ218952
 
2.9%
Other values (6)525853
 
6.9%
Space Separator
ValueCountFrequency (%)
218952
100.0%

Most occurring scripts

ValueCountFrequency (%)
Arabic7642773
97.2%
Common218952
 
2.8%

Most frequent character per script

Arabic
ValueCountFrequency (%)
م2245501
29.4%
س1282185
16.8%
ل1005174
13.2%
ي951479
12.4%
ا437904
 
5.7%
د260810
 
3.4%
ن260810
 
3.4%
ح235153
 
3.1%
ت218952
 
2.9%
أ218952
 
2.9%
Other values (6)525853
 
6.9%
Common
ValueCountFrequency (%)
218952
100.0%

Most occurring blocks

ValueCountFrequency (%)
Arabic7642773
97.2%
ASCII218952
 
2.8%

Most frequent character per block

Arabic
ValueCountFrequency (%)
م2245501
29.4%
س1282185
16.8%
ل1005174
13.2%
ي951479
12.4%
ا437904
 
5.7%
د260810
 
3.4%
ن260810
 
3.4%
ح235153
 
3.1%
ت218952
 
2.9%
أ218952
 
2.9%
Other values (6)525853
 
6.9%
ASCII
ValueCountFrequency (%)
218952
100.0%

JOB_CODE
Real number (ℝ≥0)

Distinct: 2297
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66553.96383
Minimum: 0
Maximum: 3631153
Zeros: 12
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 6582
Q1: 43333
median: 62450
Q3: 98515
95-th percentile: 99410
Maximum: 3631153
Range: 3631153
Interquartile range (IQR): 55182

Descriptive statistics

Standard deviation: 33072.72806
Coefficient of variation (CV): 0.4969310039
Kurtosis: 829.6390326
Mean: 66553.96383
Median Absolute Deviation (MAD): 32540
Skewness: 7.780544805
Sum: 1.047586012 × 10^11
Variance: 1093805341
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
99320                164555   10.5%
98515                107114   6.8%
45290                99493    6.3%
55220                63878    4.1%
62190                39552    2.5%
93990                32018    2.0%
99890                27331    1.7%
98630                27231    1.7%
98565                23128    1.5%
53250                20053    1.3%
Other values (2287)  969687   61.6%
ValueCountFrequency (%)
012
 
< 0.1%
661
 
< 0.1%
100241
< 0.1%
111014
 
< 0.1%
11202
 
< 0.1%
11401
 
< 0.1%
1190281
< 0.1%
11911
 
< 0.1%
11922
 
< 0.1%
11932
 
< 0.1%
ValueCountFrequency (%)
36311539
 
< 0.1%
22124141
 
< 0.1%
22121644
 
< 0.1%
14310131
 
< 0.1%
43610335
 
< 0.1%
43610099
 
< 0.1%
99982379
 
< 0.1%
99981598
 
< 0.1%
99980139
 
< 0.1%
999701680
0.1%

JOB_DESC
Categorical

HIGH CARDINALITY

Distinct2285
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size152.9 MiB
عامل عادى خفيف
164555 
سائق مركبه خفيفه
 
107114
بائع
 
99493
عامل نظافة
 
63878
عامل زراعى
 
39552
Other values (2280)
1099448 

Length

Max length49
Median length10
Mean length9.950757287
Min length0

Characters and Unicode

Total characters15662890
Distinct characters40
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique271 ?
Unique (%)< 0.1%

Sample

1st rowمسئول
2nd rowحداد
3rd rowنقاش
4th rowنقاش
5th rowفنى كهربائي

Common Values

ValueCountFrequency (%)
عامل عادى خفيف164555
 
10.5%
سائق مركبه خفيفه107114
 
6.8%
بائع99493
 
6.3%
عامل نظافة63878
 
4.1%
عامل زراعى39552
 
2.5%
عامل انتاج32018
 
2.0%
عامل فنى27331
 
1.7%
سائق معدات ثقيلة27231
 
1.7%
سائق شاحنة23128
 
1.5%
جارسون20053
 
1.3%
Other values (2275)969687
61.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
عامل415522
 
13.1%
سائق209989
 
6.6%
خفيف168466
 
5.3%
عادى165901
 
5.2%
فنى131392
 
4.1%
مركبه110001
 
3.5%
خفيفه107114
 
3.4%
بائع103313
 
3.3%
نظافة63997
 
2.0%
مدير50289
 
1.6%
Other values (1458)1652698
52.0%

Most occurring characters

ValueCountFrequency (%)
ا2080758
 
13.3%
1605594
 
10.3%
م1427625
 
9.1%
ع1006742
 
6.4%
ف887638
 
5.7%
ي768011
 
4.9%
ل763612
 
4.9%
ر694398
 
4.4%
ن593996
 
3.8%
ب587282
 
3.7%
Other values (30)5247234
33.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter14050278
89.7%
Space Separator1605594
 
10.3%
Other Punctuation7010
 
< 0.1%
Open Punctuation4
 
< 0.1%
Close Punctuation4
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
ا2080758
14.8%
م1427625
 
10.2%
ع1006742
 
7.2%
ف887638
 
6.3%
ي768011
 
5.5%
ل763612
 
5.4%
ر694398
 
4.9%
ن593996
 
4.2%
ب587282
 
4.2%
د537296
 
3.8%
Other values (26)4702920
33.5%
Space Separator
ValueCountFrequency (%)
1605594
100.0%
Other Punctuation
ValueCountFrequency (%)
/7010
100.0%
Open Punctuation
ValueCountFrequency (%)
(4
100.0%
Close Punctuation
ValueCountFrequency (%)
)4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Arabic14050278
89.7%
Common1612612
 
10.3%

Most frequent character per script

Arabic
ValueCountFrequency (%)
ا2080758
14.8%
م1427625
 
10.2%
ع1006742
 
7.2%
ف887638
 
6.3%
ي768011
 
5.5%
ل763612
 
5.4%
ر694398
 
4.9%
ن593996
 
4.2%
ب587282
 
4.2%
د537296
 
3.8%
Other values (26)4702920
33.5%
Common
ValueCountFrequency (%)
1605594
99.6%
/7010
 
0.4%
(4
 
< 0.1%
)4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Arabic14050278
89.7%
ASCII1612612
 
10.3%

Most frequent character per block

Arabic
ValueCountFrequency (%)
ا2080758
14.8%
م1427625
 
10.2%
ع1006742
 
7.2%
ف887638
 
6.3%
ي768011
 
5.5%
ل763612
 
5.4%
ر694398
 
4.9%
ن593996
 
4.2%
ب587282
 
4.2%
د537296
 
3.8%
Other values (26)4702920
33.5%
ASCII
ValueCountFrequency (%)
1605594
99.6%
/7010
 
0.4%
(4
 
< 0.1%
)4
 
< 0.1%

SECTOR
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size96.1 MiB
PRIVATE
1574040 

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters11018280
Distinct characters7
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPRIVATE
2nd rowPRIVATE
3rd rowPRIVATE
4th rowPRIVATE
5th rowPRIVATE

Common Values

ValueCountFrequency (%)
PRIVATE1574040
100.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
private1574040
100.0%

Most occurring characters

ValueCountFrequency (%)
P1574040
14.3%
R1574040
14.3%
I1574040
14.3%
V1574040
14.3%
A1574040
14.3%
T1574040
14.3%
E1574040
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter11018280
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P1574040
14.3%
R1574040
14.3%
I1574040
14.3%
V1574040
14.3%
A1574040
14.3%
T1574040
14.3%
E1574040
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin11018280
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P1574040
14.3%
R1574040
14.3%
I1574040
14.3%
V1574040
14.3%
A1574040
14.3%
T1574040
14.3%
E1574040
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII11018280
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P1574040
14.3%
R1574040
14.3%
I1574040
14.3%
V1574040
14.3%
A1574040
14.3%
T1574040
14.3%
E1574040
14.3%

ECONOMIC_ACT_CODE
Real number (ℝ≥0)

SKEWED

Distinct: 981
Distinct (%): 0.1%
Missing: 5764
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 60537.89821
Minimum: 8
Maximum: 3720003
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 8
5-th percentile: 11045
Q1: 51201
median: 61246
Q3: 71141
95-th percentile: 95201
Maximum: 3720003
Range: 3719995
Interquartile range (IQR): 19940

Descriptive statistics

Standard deviation: 33119.90792
Coefficient of variation (CV): 0.547093786
Kurtosis: 3926.438931
Mean: 60537.89821
Median Absolute Deviation (MAD): 9924
Skewness: 39.20844929
Sum: 9.494013285 × 10^10
Variance: 1096928301
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count    Frequency (%)
61244               173082   11.0%
61246               82254    5.2%
63102               64775    4.1%
61245               58902    3.7%
51100               43661    2.8%
62159               37392    2.4%
71141               36597    2.3%
92007               30410    1.9%
11037               23693    1.5%
51201               22449    1.4%
Other values (971)  995061   63.2%
ValueCountFrequency (%)
81
 
< 0.1%
3217
 
< 0.1%
5127
 
< 0.1%
636
 
< 0.1%
735
 
< 0.1%
825
 
< 0.1%
8311
 
< 0.1%
921138
0.1%
941
 
< 0.1%
11256
 
< 0.1%
ValueCountFrequency (%)
372000340
 
< 0.1%
9310303
 
< 0.1%
9310049
 
< 0.1%
931002374
 
< 0.1%
9310015
 
< 0.1%
85410833
 
< 0.1%
39101256
 
< 0.1%
391011138
 
< 0.1%
39000114
 
< 0.1%
11114120888
1.3%

ECONOMIC_ACT_DESC
Categorical

HIGH CARDINALITY

Distinct967
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size204.2 MiB
التجارة العامة و المقاولات
173082 
الادارة العامة ( ادارة الشركات )
 
82254
المطاعم
 
64775
التجارة العامة
 
58902
المقاولات العامة للمباني
 
43664
Other values (962)
1151363 

Length

Max length104
Median length26
Mean length27.04982275
Min length0

Characters and Unicode

Total characters42577503
Distinct characters44
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique23 ?
Unique (%)< 0.1%

Sample

1st rowالتجارة العامة و المقاولات
2nd rowالتجارة العامة و المقاولات
3rd rowالتجارة العامة و المقاولات
4th rowالتجارة العامة و المقاولات
5th rowمقاولات انشاءات كهربائية وميكانيكية مثل محطات توليد الكهرباء

Common Values

ValueCountFrequency (%)
التجارة العامة و المقاولات173082
 
11.0%
الادارة العامة ( ادارة الشركات )82254
 
5.2%
المطاعم64775
 
4.1%
التجارة العامة58902
 
3.7%
المقاولات العامة للمباني43664
 
2.8%
الاسواق المركزية37392
 
2.4%
نقل البضائع داخل الكويت36597
 
2.3%
مقاولات تنظيف المبانى و الشوارع30410
 
1.9%
اعمال هندسية وتوريد وانشاءات23693
 
1.5%
مقاولات انشاء ورصف الطرق والشوارع وغيرها22449
 
1.4%
Other values (957)1000822
63.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
العامة371632
 
5.8%
و334939
 
5.2%
242675
 
3.8%
التجارة232109
 
3.6%
المقاولات217400
 
3.4%
تجارة150639
 
2.4%
مقاولات135790
 
2.1%
ادارة90802
 
1.4%
الشركات82362
 
1.3%
الادارة82254
 
1.3%
Other values (1868)4467693
69.7%

Most occurring characters

ValueCountFrequency (%)
ا8960727
21.0%
4853698
11.4%
ل4631080
10.9%
ت2363159
 
5.6%
م2287871
 
5.4%
و2193804
 
5.2%
ة2064114
 
4.8%
ر1849187
 
4.3%
ي1781275
 
4.2%
ن1179066
 
2.8%
Other values (34)10413522
24.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter37400018
87.8%
Space Separator4853698
 
11.4%
Open Punctuation152739
 
0.4%
Close Punctuation148328
 
0.3%
Other Punctuation16556
 
< 0.1%
Dash Punctuation6164
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
ا8960727
24.0%
ل4631080
12.4%
ت2363159
 
6.3%
م2287871
 
6.1%
و2193804
 
5.9%
ة2064114
 
5.5%
ر1849187
 
4.9%
ي1781275
 
4.8%
ن1179066
 
3.2%
ع1138231
 
3.0%
Other values (26)8951504
23.9%
Other Punctuation
ValueCountFrequency (%)
?15912
96.1%
.572
 
3.5%
,68
 
0.4%
/4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
4853698
100.0%
Open Punctuation
ValueCountFrequency (%)
(152739
100.0%
Close Punctuation
ValueCountFrequency (%)
)148328
100.0%
Dash Punctuation
ValueCountFrequency (%)
-6164
100.0%

Most occurring scripts

ValueCountFrequency (%)
Arabic37400018
87.8%
Common5177485
 
12.2%

Most frequent character per script

Arabic
ValueCountFrequency (%)
ا8960727
24.0%
ل4631080
12.4%
ت2363159
 
6.3%
م2287871
 
6.1%
و2193804
 
5.9%
ة2064114
 
5.5%
ر1849187
 
4.9%
ي1781275
 
4.8%
ن1179066
 
3.2%
ع1138231
 
3.0%
Other values (26)8951504
23.9%
Common
ValueCountFrequency (%)
4853698
93.7%
(152739
 
3.0%
)148328
 
2.9%
?15912
 
0.3%
-6164
 
0.1%
.572
 
< 0.1%
,68
 
< 0.1%
/4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Arabic37400018
87.8%
ASCII5177485
 
12.2%

Most frequent character per block

Arabic
ValueCountFrequency (%)
ا8960727
24.0%
ل4631080
12.4%
ت2363159
 
6.3%
م2287871
 
6.1%
و2193804
 
5.9%
ة2064114
 
5.5%
ر1849187
 
4.9%
ي1781275
 
4.8%
ن1179066
 
3.2%
ع1138231
 
3.0%
Other values (26)8951504
23.9%
ASCII
ValueCountFrequency (%)
4853698
93.7%
(152739
 
3.0%
)148328
 
2.9%
?15912
 
0.3%
-6164
 
0.1%
.572
 
< 0.1%
,68
 
< 0.1%
/4
 
< 0.1%

EDUCATION_CODE
Real number (ℝ)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 55
Distinct (%): < 0.1%
Missing: 88711
Missing (%): 5.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 180.4542314
Minimum: -1
Maximum: 9700800
Zeros: 0
Zeros (%): 0.0%
Negative: 826
Negative (%): 0.1%
Memory size: 12.0 MiB

Quantile statistics

Minimum: -1
5-th percentile: 14
Q1: 35
median: 45
Q3: 45
95-th percentile: 55
Maximum: 9700800
Range: 9700801
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 15856.87138
Coefficient of variation (CV): 87.87198426
Kurtosis: 190487.524
Mean: 180.4542314
Median Absolute Deviation (MAD): 0
Skewness: 338.9212143
Sum: 268033903
Variance: 251440369.9
Monotonicity: Not monotonic
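
The gap between the median (45) and the mean (180.45), together with a MAD of 0 and a maximum of 9700800, suggests that a small number of very large codes drive the skewness of 338.92. The sketch below reproduces the effect on toy data with a single outlier. The toy values are hypothetical, and the function computes the plain population skewness, whereas the report's γ1 is likely a bias-corrected sample estimate, so the figures are not directly comparable.

```python
from statistics import mean, median


def skewness(values):
    """Population (biased) skewness: third central moment over std cubed."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Toy data (hypothetical): 99 ordinary codes and one very large code.
codes = [45] * 98 + [35] + [9_700_800]

print(median(codes), mean(codes), skewness(codes))
```

Treating EDUCATION_CODE as a categorical code rather than a number, or inspecting the handful of very large codes separately, may be more informative than the moment statistics.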
Histogram with fixed size bins (bins=50)
Value              Count    Frequency (%)
45                 801314   50.9%
35                 371240   23.6%
14                 156902   10.0%
55                 104106   6.6%
71                 22210    1.4%
20                 18131    1.2%
70                 4910     0.3%
13                 2387     0.2%
11                 1684     0.1%
-1                 826      0.1%
Other values (45)  1619     0.1%
(Missing)          88711    5.6%

Value              Count    Frequency (%)
-1                 826      0.1%
4                  1        < 0.1%
10                 683      < 0.1%
11                 1684     0.1%
12                 734      < 0.1%
13                 2387     0.2%
14                 156902   10.0%
20                 18131    1.2%
35                 371240   23.6%
45                 801314   50.9%

Value              Count    Frequency (%)
9700800            2        < 0.1%
979837             1        < 0.1%
979836             1        < 0.1%
979835             1        < 0.1%
979830             11       < 0.1%
979823             18       < 0.1%
979822             7        < 0.1%
979821             1        < 0.1%
979820             37       < 0.1%
979809             1        < 0.1%

EDUCATION_DESC
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 136.8 MiB

متوسط: 801314
ثانوية: 371240
جامعى: 144232
ابتدائي: 104106
(empty string): 94649
Other values (7): 58499

Length

Max length: 35
Median length: 5
Mean length: 5.268034485
Min length: 0

Characters and Unicode

Total characters: 8292097
Distinct characters: 23
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
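
As an illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the code the report itself uses; the helper name `category_counts` and the two sample labels are chosen for the example. The codes `Lo` and `Zs` correspond to the "Other Letter" and "Space Separator" categories listed in this report. The standard library does not expose script or block names, so those are not covered here.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for text in values:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts

# Two labels that appear in EDUCATION_DESC.
sample = ["متوسط", "دبلوم دراسات عليا"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```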

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (empty string)
2nd row: متوسط
3rd row: متوسط
4th row: متوسط
5th row: متوسط

Common Values

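Value    Count    Frequency (%)
متوسط (intermediate)    801314    50.9%
ثانوية (secondary)    371240    23.6%
جامعى (university)    144232    9.2%
ابتدائي (primary)    104106    6.6%
(empty string)    94649    6.0%
خبرة وبدون مؤهل (experience, no qualification)    22210    1.4%
دبلوم (diploma)    18131    1.2%
جامعي (university, alternate spelling)    12670    0.8%
دبلوم دراسات عليا سنة بعد الجامعى (postgraduate diploma, one year after university)    2387    0.2%
ماجستير (master's)    1684    0.1%
Other values (2)    1417    0.1%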
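
The table lists the same label under two spellings, one ending in alef maksura (U+0649) and one ending in yeh (U+064A). Their counts, 144232 and 12670, add up to 156902, which is the count reported for EDUCATION_CODE 14. The sketch below merges the two spellings under the assumption that they denote the same education level; `normalize_label` is a hypothetical helper, and folding every word-final alef maksura may be too aggressive for other columns.

```python
from collections import Counter

STEM = "جامع"
ALEF_MAKSURA = "\u0649"
YEH = "\u064A"

# Counts copied from the Common Values table above.
counts = Counter({
    STEM + ALEF_MAKSURA: 144232,
    STEM + YEH: 12670,
    "متوسط": 801314,
})

def normalize_label(label: str) -> str:
    """Fold a word-final alef maksura to yeh in every word of the label."""
    words = label.split(" ")
    folded = [w[:-1] + YEH if w.endswith(ALEF_MAKSURA) else w for w in words]
    return " ".join(folded)

merged = Counter()
for label, n in counts.items():
    merged[normalize_label(label)] += n

print(merged[STEM + YEH])
```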

Length

Histogram of lengths of the category
ValueCountFrequency (%)
متوسط801314
52.1%
ثانوية371240
24.1%
جامعى144232
 
9.4%
ابتدائي104106
 
6.8%
وبدون22210
 
1.4%
مؤهل22210
 
1.4%
خبرة22210
 
1.4%
دبلوم21252
 
1.4%
جامعي12670
 
0.8%
دراسات3121
 
0.2%
Other values (7)14851
 
1.0%

Most occurring characters

ValueCountFrequency (%)
و1238909
14.9%
م1006483
12.1%
ت911642
11.0%
س809240
9.8%
ط801314
9.7%
ا755060
9.1%
ي492821
 
5.9%
ن397305
 
4.8%
ة395837
 
4.8%
ث371240
 
4.5%
Other values (13)1112246
13.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter8232072
99.3%
Space Separator60025
 
0.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
و1238909
15.0%
م1006483
12.2%
ت911642
11.1%
س809240
9.8%
ط801314
9.7%
ا755060
9.2%
ي492821
 
6.0%
ن397305
 
4.8%
ة395837
 
4.8%
ث371240
 
4.5%
Other values (12)1052221
12.8%
Space Separator
ValueCountFrequency (%)
60025
100.0%

Most occurring scripts

ValueCountFrequency (%)
Arabic8232072
99.3%
Common60025
 
0.7%

Most frequent character per script

Arabic
ValueCountFrequency (%)
و1238909
15.0%
م1006483
12.2%
ت911642
11.1%
س809240
9.8%
ط801314
9.7%
ا755060
9.2%
ي492821
 
6.0%
ن397305
 
4.8%
ة395837
 
4.8%
ث371240
 
4.5%
Other values (12)1052221
12.8%
Common
ValueCountFrequency (%)
60025
100.0%

Most occurring blocks

ValueCountFrequency (%)
Arabic8232072
99.3%
ASCII60025
 
0.7%

Most frequent character per block

Arabic
ValueCountFrequency (%)
و1238909
15.0%
م1006483
12.2%
ت911642
11.1%
س809240
9.8%
ط801314
9.7%
ا755060
9.2%
ي492821
 
6.0%
ن397305
 
4.8%
ة395837
 
4.8%
ث371240
 
4.5%
Other values (12)1052221
12.8%
ASCII
ValueCountFrequency (%)
60025
100.0%

MAJOR_CODE
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 1421
Distinct (%): 0.1%
Missing: 88711
Missing (%): 5.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 462556.0553
Minimum: -1
Maximum: 9700800
Zeros: 0
Zeros (%): 0.0%
Negative: 826
Negative (%): 0.1%
Memory size: 12.0 MiB

Quantile statistics

Minimum: -1
5-th percentile: 55
Q1: 353002
Median: 453501
Q3: 453501
95-th percentile: 971509
Maximum: 9700800
Range: 9700801
Interquartile range (IQR): 100499

Descriptive statistics

Standard deviation: 237409.3255
Coefficient of variation (CV): 0.5132552537
Kurtosis: 4.141190013
Mean: 462556.0553
Median Absolute Deviation (MAD): 8
Skewness: 0.7669988393
Sum: 6.87047923 × 10^11
Variance: 5.636318783 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most common values

Value                  Count     Frequency (%)
453501                 737903    46.9%
353002                 343902    21.8%
970080                 92795     5.9%
55                     73994     4.7%
453509                 38187     2.4%
554000                 30112     1.9%
972370                 27334     1.7%
45                     25224     1.6%
14                     12670     0.8%
970296                 11182     0.7%
Other values (1411)    92026     5.8%
(Missing)              88711     5.6%

Smallest 10 values

Value      Count     Frequency (%)
-1         826       0.1%
4          1         < 0.1%
14         12670     0.8%
20         1         < 0.1%
35         4         < 0.1%
45         25224     1.6%
55         73994     4.7%
70         4910      0.3%
353        1         < 0.1%
2117       2         < 0.1%

Largest 10 values

Value      Count     Frequency (%)
9700800    2         < 0.1%
980401     5         < 0.1%
980312     19        < 0.1%
980308     7         < 0.1%
980305     1         < 0.1%
980296     133       < 0.1%
980295     1         < 0.1%
980293     6         < 0.1%
980292     6         < 0.1%
980273     52        < 0.1%

SALARY
Real number (ℝ≥0)

Distinct: 7224
Distinct (%): 0.5%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 316.2937736
Minimum: 1.425
Maximum: 26000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 1.425
5-th percentile: 75
Q1: 100
Median: 180
Q3: 350
95-th percentile: 1000
Maximum: 26000
Range: 25998.575
Interquartile range (IQR): 250

Descriptive statistics

Standard deviation: 446.9902569
Coefficient of variation (CV): 1.413212318
Kurtosis: 108.3040281
Mean: 316.2937736
Median Absolute Deviation (MAD): 80
Skewness: 6.978051131
Sum: 497858735.2
Variance: 199800.2898
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most common values

Value                  Count     Frequency (%)
100                    176864    11.2%
250                    128443    8.2%
150                    126880    8.1%
75                     113054    7.2%
80                     72402     4.6%
120                    70589     4.5%
200                    65688     4.2%
300                    49466     3.1%
500                    43864     2.8%
450                    40260     2.6%
Other values (7214)    686529    43.6%
ValueCountFrequency (%)
1.4251
 
< 0.1%
276
< 0.1%
2.122
 
< 0.1%
2.175
 
< 0.1%
2.22
 
< 0.1%
2.2672
 
< 0.1%
2.31
 
< 0.1%
2.311
 
< 0.1%
2.3311
 
< 0.1%
2.345
 
< 0.1%
Largest 10 values

Value        Count    Frequency (%)
26000        1        < 0.1%
25000        1        < 0.1%
24003.485    1        < 0.1%
22861        1        < 0.1%
20000        1        < 0.1%
17500        2        < 0.1%
15366        1        < 0.1%
15100        1        < 0.1%
15000        8        < 0.1%
14999        1        < 0.1%

SALARY_TYPE
Categorical

CONSTANT
MISSING
REJECTED

Distinct: 1
Distinct (%): 100.0%
Missing: 1574039
Missing (%): > 99.9%
Memory size: 60.0 MiB
1.0

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 3
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 1.0

Common Values

Value        Count      Frequency (%)
1.0          1          < 0.1%
(Missing)    1574039    > 99.9%
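
SALARY_TYPE holds a single non-missing value in 1574040 rows, which is why the report marks it CONSTANT, MISSING and REJECTED. A hedged sketch of how such columns could be filtered out before modelling is shown below. The helper `drop_sparse_columns` and its `max_missing` threshold are illustrative assumptions, not part of the report or of any library.

```python
import pandas as pd

def drop_sparse_columns(df: pd.DataFrame, max_missing: float = 0.99) -> pd.DataFrame:
    """Return df without the columns whose share of missing values exceeds max_missing."""
    missing_share = df.isna().mean()          # per-column fraction of NaN
    keep = missing_share[missing_share <= max_missing].index
    return df[keep]

# Toy frame: SALARY is complete, SALARY_TYPE is almost entirely missing.
df = pd.DataFrame({
    "SALARY": [100.0, 250.0, 150.0, 75.0],
    "SALARY_TYPE": [1.0, None, None, None],
})

# With only four rows the missing share is 0.75, so a lower threshold is used here.
reduced = drop_sparse_columns(df, max_missing=0.5)
print(list(reduced.columns))
```

On the full dataset the missing share of SALARY_TYPE is 1574039 / 1574040, so the default threshold of 0.99 would already remove it.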

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
1.01
100.0%

Most occurring characters

ValueCountFrequency (%)
11
33.3%
.1
33.3%
01
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number2
66.7%
Other Punctuation1
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
11
50.0%
01
50.0%
Other Punctuation
ValueCountFrequency (%)
.1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common3
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
11
33.3%
.1
33.3%
01
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
11
33.3%
.1
33.3%
01
33.3%

ONR_GVRN_CODE
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.006389291
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.315616782
Coefficient of variation (CV): 0.5779809733
Kurtosis: -1.549241881
Mean: 4.006389291
Median Absolute Deviation (MAD): 2
Skewness: -0.03514552843
Sum: 6306217
Variance: 5.362081081
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Most common values

Value    Count     Frequency (%)
1        378777    24.1%
7        354674    22.5%
5        221482    14.1%
6        186621    11.9%
2        179403    11.4%
3        153552    9.8%
4        99531     6.3%

Smallest values

Value    Count     Frequency (%)
1        378777    24.1%
2        179403    11.4%
3        153552    9.8%
4        99531     6.3%
5        221482    14.1%
6        186621    11.9%
7        354674    22.5%

Largest values

Value    Count     Frequency (%)
7        354674    22.5%
6        186621    11.9%
5        221482    14.1%
4        99531     6.3%
3        153552    9.8%
2        179403    11.4%
1        378777    24.1%

GOVERNORATE_DESC
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 167.4 MiB

محافظة العاصمة: 378777
العقود الحكومية: 354674
محافظة الفروانية: 221482
محافظة مبارك الكبير: 186621
محافظة حولي: 179403
Other values (2): 253083

Length

Max length: 19
Median length: 14
Mean length: 14.75762624
Min length: 11

Characters and Unicode

Total characters: 23229094
Distinct characters: 21
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: محافظة الفروانية
2nd row: محافظة الفروانية
3rd row: محافظة الفروانية
4th row: محافظة الفروانية
5th row: محافظة الفروانية

Common Values

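Value    Count    Frequency (%)
محافظة العاصمة (Capital Governorate)    378777    24.1%
العقود الحكومية (Government Contracts)    354674    22.5%
محافظة الفروانية (Farwaniya Governorate)    221482    14.1%
محافظة مبارك الكبير (Mubarak Al-Kabeer Governorate)    186621    11.9%
محافظة حولي (Hawalli Governorate)    179403    11.4%
محافظة الاحمدي (Ahmadi Governorate)    153552    9.8%
محافظة الجهراء (Jahra Governorate)    99531    6.3%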
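
The seven counts in this table are identical to the seven counts reported for ONR_GVRN_CODE, which suggests that the code and the description carry the same information and explains the HIGH CORRELATION flag on both. The sketch below shows one way to check for a one-to-one mapping with pandas. The helper `is_one_to_one` is hypothetical, and the code-to-label pairs in the toy frame are inferred from the matching counts, not stated in the report.

```python
import pandas as pd

def is_one_to_one(df: pd.DataFrame, left: str, right: str) -> bool:
    """True if every value of `left` pairs with exactly one value of `right` and vice versa."""
    pairs = df[[left, right]].drop_duplicates()
    return bool(pairs[left].is_unique and pairs[right].is_unique)

# Toy rows; pairings are inferred from equal counts (1: 378777, 7: 354674, 5: 221482).
df = pd.DataFrame({
    "ONR_GVRN_CODE": [1, 1, 7, 5, 5],
    "GOVERNORATE_DESC": [
        "محافظة العاصمة", "محافظة العاصمة",
        "العقود الحكومية",
        "محافظة الفروانية", "محافظة الفروانية",
    ],
})

print(is_one_to_one(df, "ONR_GVRN_CODE", "GOVERNORATE_DESC"))
```

If the check holds on the full data, one of the two columns can be dropped for modelling and kept only as a lookup.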

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
محافظة1219366
36.6%
العاصمة378777
 
11.4%
الحكومية354674
 
10.6%
العقود354674
 
10.6%
الفروانية221482
 
6.6%
مبارك186621
 
5.6%
الكبير186621
 
5.6%
حولي179403
 
5.4%
الاحمدي153552
 
4.6%
الجهراء99531
 
3.0%

Most occurring characters

ValueCountFrequency (%)
ا4008640
17.3%
م2292990
9.9%
ة2174299
9.4%
ل1928714
8.3%
ح1906995
8.2%
1760661
7.6%
ف1440848
 
6.2%
ظ1219366
 
5.2%
و1110233
 
4.8%
ي1095732
 
4.7%
Other values (11)4290616
18.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter21468433
92.4%
Space Separator1760661
 
7.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
ا4008640
18.7%
م2292990
10.7%
ة2174299
10.1%
ل1928714
9.0%
ح1906995
8.9%
ف1440848
 
6.7%
ظ1219366
 
5.7%
و1110233
 
5.2%
ي1095732
 
5.1%
ع733451
 
3.4%
Other values (10)3557165
16.6%
Space Separator
ValueCountFrequency (%)
1760661
100.0%

Most occurring scripts

ValueCountFrequency (%)
Arabic21468433
92.4%
Common1760661
 
7.6%

Most frequent character per script

Arabic
ValueCountFrequency (%)
ا4008640
18.7%
م2292990
10.7%
ة2174299
10.1%
ل1928714
9.0%
ح1906995
8.9%
ف1440848
 
6.7%
ظ1219366
 
5.7%
و1110233
 
5.2%
ي1095732
 
5.1%
ع733451
 
3.4%
Other values (10)3557165
16.6%
Common
ValueCountFrequency (%)
1760661
100.0%

Most occurring blocks

ValueCountFrequency (%)
Arabic21468433
92.4%
ASCII1760661
 
7.6%

Most frequent character per block

Arabic
ValueCountFrequency (%)
ا4008640
18.7%
م2292990
10.7%
ة2174299
10.1%
ل1928714
9.0%
ح1906995
8.9%
ف1440848
 
6.7%
ظ1219366
 
5.7%
و1110233
 
5.2%
ي1095732
 
5.1%
ع733451
 
3.4%
Other values (10)3557165
16.6%
ASCII
ValueCountFrequency (%)
1760661
100.0%

MARITAL_STATUS_CODE
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.836254479
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 2
Q3: 2
95-th percentile: 2
Maximum: 11
Range: 10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.5376797457
Coefficient of variation (CV): 0.2928133066
Kurtosis: 13.28246702
Mean: 1.836254479
Median Absolute Deviation (MAD): 0
Skewness: 1.804203682
Sum: 2890338
Variance: 0.2890995089
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Most common values

Value    Count      Frequency (%)
2        1232659    78.3%
1        319096     20.3%
5        19116      1.2%
3        2394       0.2%
4        764        < 0.1%
10       6          < 0.1%
11       3          < 0.1%
6        1          < 0.1%
7        1          < 0.1%

Smallest values

Value    Count      Frequency (%)
1        319096     20.3%
2        1232659    78.3%
3        2394       0.2%
4        764        < 0.1%
5        19116      1.2%
6        1          < 0.1%
7        1          < 0.1%
10       6          < 0.1%
11       3          < 0.1%

Largest values

Value    Count      Frequency (%)
11       3          < 0.1%
10       6          < 0.1%
7        1          < 0.1%
6        1          < 0.1%
5        19116      1.2%
4        764        < 0.1%
3        2394       0.2%
2        1232659    78.3%
1        319096     20.3%

MARITAL_STATUS_DESC
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 137.0 MiB

متزوج: 1232659
أعزب: 319096
(empty string): 15968
غير معرف: 3159
مطلق: 2394

Length

Max length: 8
Median length: 5
Mean length: 4.75056733
Min length: 0

Characters and Unicode

Total characters: 7477583
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: متزوج
2nd row: متزوج
3rd row: أعزب
4th row: متزوج
5th row: متزوج

Common Values

Value    Count    Frequency (%)
متزوج (married)    1232659    78.3%
أعزب (single)    319096    20.3%
(empty string)    15968    1.0%
غير معرف (undefined)    3159    0.2%
مطلق (divorced)    2394    0.2%
أرمل (widowed)    764    < 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
متزوج1232659
79.0%
أعزب319096
 
20.4%
معرف3159
 
0.2%
غير3159
 
0.2%
مطلق2394
 
0.2%
أرمل764
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
ز1551755
20.8%
م1238976
16.6%
ت1232659
16.5%
و1232659
16.5%
ج1232659
16.5%
ع322255
 
4.3%
أ319860
 
4.3%
ب319096
 
4.3%
ر7082
 
0.1%
غ3159
 
< 0.1%
Other values (6)17423
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter7474424
> 99.9%
Space Separator3159
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
ز1551755
20.8%
م1238976
16.6%
ت1232659
16.5%
و1232659
16.5%
ج1232659
16.5%
ع322255
 
4.3%
أ319860
 
4.3%
ب319096
 
4.3%
ر7082
 
0.1%
غ3159
 
< 0.1%
Other values (5)14264
 
0.2%
Space Separator
ValueCountFrequency (%)
3159
100.0%

Most occurring scripts

ValueCountFrequency (%)
Arabic7474424
> 99.9%
Common3159
 
< 0.1%

Most frequent character per script

Arabic
ValueCountFrequency (%)
ز1551755
20.8%
م1238976
16.6%
ت1232659
16.5%
و1232659
16.5%
ج1232659
16.5%
ع322255
 
4.3%
أ319860
 
4.3%
ب319096
 
4.3%
ر7082
 
0.1%
غ3159
 
< 0.1%
Other values (5)14264
 
0.2%
Common
ValueCountFrequency (%)
3159
100.0%

Most occurring blocks

ValueCountFrequency (%)
Arabic7474424
> 99.9%
ASCII3159
 
< 0.1%

Most frequent character per block

Arabic
ValueCountFrequency (%)
ز1551755
20.8%
م1238976
16.6%
ت1232659
16.5%
و1232659
16.5%
ج1232659
16.5%
ع322255
 
4.3%
أ319860
 
4.3%
ب319096
 
4.3%
ر7082
 
0.1%
غ3159
 
< 0.1%
Other values (5)14264
 
0.2%
ASCII
ValueCountFrequency (%)
3159
100.0%

COMPANY_NAME
Categorical

HIGH CARDINALITY

Distinct: 122145
Distinct (%): 7.8%
Missing: 0
Missing (%): 0.0%
Memory size: 247.6 MiB

مبنى الركاب الجديد بمطار الكويت (المبنى 11): 7109
شركة محمد حمود الشايع: 5525
الشركة الاحمدية للمقاولات والتجارة: 5292
(10377) شركة مطاحن الدقيق والمخابز الكويتية: 4613
(ادارة العقودالحكوميه(اعادة قيد: 3934
Other values (122140): 1547567

Length

Max length: 128
Median length: 37
Mean length: 41.46124431
Min length: 3

Characters and Unicode

Total characters: 65261657
Distinct characters: 97
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25127
Unique (%): 1.6%

Sample

1st row: مؤسسة الرهيب الوطنية للتجارة العامة والمقاولات
2nd row: مؤسسة الرهيب الوطنية للتجارة العامة والمقاولات
3rd row: مؤسسة الرهيب الوطنية للتجارة العامة والمقاولات
4th row: مؤسسة الرهيب الوطنية للتجارة العامة والمقاولات
5th row: شركة الفيالق الكويتية للتجارة العامة والمقاولات

Common Values

Value    Count    Frequency (%)
مبنى الركاب الجديد بمطار الكويت (المبنى 11)    7109    0.5%
شركة محمد حمود الشايع    5525    0.4%
الشركة الاحمدية للمقاولات والتجارة    5292    0.3%
(10377) شركة مطاحن الدقيق والمخابز الكويتية    4613    0.3%
(ادارة العقودالحكوميه(اعادة قيد    3934    0.2%
مركز التجمع الجديد في جنوب وشرق الكويت GC-32    3756    0.2%
إنشاء وانجاز وصيانة مشروع معسكر الشيخ سالم العلي السالم الصباح    3627    0.2%
شركة بدر الملا واخوانه    3616    0.2%
مشروع خط انابيب التغذية لشركة نفط الكويت للمصفاة لجديدة NRP    3590    0.2%
الشركة الكويتية للاغذية ( الامريكانا )    3403    0.2%
Other values (122135)    1529575    97.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
شركة585643
 
6.1%
العامة212020
 
2.2%
للتجارة175881
 
1.8%
والمقاولات143362
 
1.5%
131687
 
1.4%
الكويت100820
 
1.0%
خدمات89090
 
0.9%
وصيانة68642
 
0.7%
اعمال61724
 
0.6%
العامه56008
 
0.6%
Other values (47598)8035801
83.2%

Most occurring characters

ValueCountFrequency (%)
ا9805328
15.0%
8845785
13.6%
ل7537818
11.6%
ي3670610
 
5.6%
م3588597
 
5.5%
و3061984
 
4.7%
ر3002295
 
4.6%
ة2934440
 
4.5%
ت2636526
 
4.0%
ن1980191
 
3.0%
Other values (87)18198083
27.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter55405898
84.9%
Space Separator8845785
 
13.6%
Decimal Number442486
 
0.7%
Open Punctuation143008
 
0.2%
Close Punctuation126191
 
0.2%
Other Punctuation117674
 
0.2%
Dash Punctuation107636
 
0.2%
Uppercase Letter49204
 
0.1%
Modifier Letter23150
 
< 0.1%
Lowercase Letter415
 
< 0.1%
Other values (3)210
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
ا9805328
17.7%
ل7537818
13.6%
ي3670610
 
6.6%
م3588597
 
6.5%
و3061984
 
5.5%
ر3002295
 
5.4%
ة2934440
 
5.3%
ت2636526
 
4.8%
ن1980191
 
3.6%
ع1645339
 
3.0%
Other values (26)15542770
28.1%
Uppercase Letter
ValueCountFrequency (%)
N7089
14.4%
C6631
13.5%
G6543
13.3%
R5483
11.1%
P4954
10.1%
L2911
 
5.9%
A2521
 
5.1%
B1793
 
3.6%
H1632
 
3.3%
I1524
 
3.1%
Other values (13)8123
16.5%
Other Punctuation
ValueCountFrequency (%)
/95413
81.1%
.13342
 
11.3%
,6256
 
5.3%
&1949
 
1.7%
،384
 
0.3%
"256
 
0.2%
\49
 
< 0.1%
*16
 
< 0.1%
:7
 
< 0.1%
%2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1111811
25.3%
069668
15.7%
259821
13.5%
347456
10.7%
728861
 
6.5%
528505
 
6.4%
428474
 
6.4%
823576
 
5.3%
623569
 
5.3%
920745
 
4.7%
Nonspacing Mark
ValueCountFrequency (%)
ٌ18
28.1%
ً18
28.1%
ُ15
23.4%
َ9
14.1%
ِ4
 
6.2%
Math Symbol
ValueCountFrequency (%)
+55
93.2%
>3
 
5.1%
|1
 
1.7%
Lowercase Letter
ValueCountFrequency (%)
l412
99.3%
x2
 
0.5%
z1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
)126190
> 99.9%
]1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
8845785
100.0%
Open Punctuation
ValueCountFrequency (%)
(143008
100.0%
Dash Punctuation
ValueCountFrequency (%)
-107636
100.0%
Modifier Letter
ValueCountFrequency (%)
ـ23150
100.0%
Connector Punctuation
ValueCountFrequency (%)
_87
100.0%
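
The per-category tables above show 23150 tatweel characters (the Modifier Letter ـ, U+0640) and a small number of combining vowel marks (Nonspacing Mark) in COMPANY_NAME. Both can make otherwise identical names count as distinct values. The sketch below strips them with the standard `unicodedata` module; `normalize_name` is a hypothetical helper, and whether this normalisation is appropriate for the dataset is an assumption that would need checking against the source system. Spelling variants such as العامة / العامه are deliberately left alone.

```python
import unicodedata

TATWEEL = "\u0640"  # Arabic elongation character, category Lm

def normalize_name(name: str) -> str:
    """Remove tatweel and combining marks (category Mn), then collapse whitespace."""
    kept = (
        ch for ch in name
        if ch != TATWEEL and unicodedata.category(ch) != "Mn"
    )
    return " ".join("".join(kept).split())

# "\u0640" is a tatweel inside the first word; the double space is collapsed.
print(normalize_name("شر\u0640كة  الكويت"))
```

Comparing the number of distinct values before and after such a step gives a rough sense of how much of the 122145-value cardinality is due to formatting noise.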

Most occurring scripts

ValueCountFrequency (%)
Arabic55405898
84.9%
Common9806076
 
15.0%
Latin49619
 
0.1%
Inherited64
 
< 0.1%

Most frequent character per script

Arabic
ValueCountFrequency (%)
ا9805328
17.7%
ل7537818
13.6%
ي3670610
 
6.6%
م3588597
 
6.5%
و3061984
 
5.5%
ر3002295
 
5.4%
ة2934440
 
5.3%
ت2636526
 
4.8%
ن1980191
 
3.6%
ع1645339
 
3.0%
Other values (26)15542770
28.1%
Common
ValueCountFrequency (%)
8845785
90.2%
(143008
 
1.5%
)126190
 
1.3%
1111811
 
1.1%
-107636
 
1.1%
/95413
 
1.0%
069668
 
0.7%
259821
 
0.6%
347456
 
0.5%
728861
 
0.3%
Other values (20)170427
 
1.7%
Latin
ValueCountFrequency (%)
N7089
14.3%
C6631
13.4%
G6543
13.2%
R5483
11.1%
P4954
10.0%
L2911
 
5.9%
A2521
 
5.1%
B1793
 
3.6%
H1632
 
3.3%
I1524
 
3.1%
Other values (16)8538
17.2%
Inherited
ValueCountFrequency (%)
ٌ18
28.1%
ً18
28.1%
ُ15
23.4%
َ9
14.1%
ِ4
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
Arabic55429496
84.9%
ASCII9832161
 
15.1%

Most frequent character per block

Arabic
ValueCountFrequency (%)
ا9805328
17.7%
ل7537818
13.6%
ي3670610
 
6.6%
م3588597
 
6.5%
و3061984
 
5.5%
ر3002295
 
5.4%
ة2934440
 
5.3%
ت2636526
 
4.8%
ن1980191
 
3.6%
ع1645339
 
3.0%
Other values (33)15566368
28.1%
ASCII
ValueCountFrequency (%)
8845785
90.0%
(143008
 
1.5%
)126190
 
1.3%
1111811
 
1.1%
-107636
 
1.1%
/95413
 
1.0%
069668
 
0.7%
259821
 
0.6%
347456
 
0.5%
728861
 
0.3%
Other values (44)196512
 
2.0%
Distinct: 4826
Distinct (%): 0.3%
Missing: 42
Missing (%): < 0.1%
Memory size: 12.0 MiB
Minimum: 1966-12-12 00:00:00
Maximum: 2020-12-31 00:00:00
Histogram with fixed size bins (bins=50)

ADDRESS_AUTO_NO
Real number (ℝ≥0)

MISSING

Distinct: 112316
Distinct (%): 9.2%
Missing: 354729
Missing (%): 22.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 16104080.23
Minimum: 0
Maximum: 99999999
Zeros: 1189
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 10222598
Q1: 13214876
Median: 16388361
Q3: 18894778
95-th percentile: 20882688
Maximum: 99999999
Range: 99999999
Interquartile range (IQR): 5679902

Descriptive statistics

Standard deviation: 4275012.298
Coefficient of variation (CV): 0.2654614381
Kurtosis: 134.9536085
Mean: 16104080.23
Median Absolute Deviation (MAD): 2654096
Skewness: 6.752699145
Sum: 1.963588217 × 10^13
Variance: 1.827573015 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most common values

Value                    Count      Frequency (%)
20679196                 5507       0.3%
10555878                 5292       0.3%
10595474                 4613       0.3%
20774206                 3470       0.2%
10615834                 3401       0.2%
21107704                 2949       0.2%
10244973                 2633       0.2%
20243296                 2451       0.2%
10310395                 2368       0.2%
10081625                 2245       0.1%
Other values (112306)    1184382    75.2%
(Missing)                354729     22.5%

Smallest 10 values

Value       Count    Frequency (%)
0           1189     0.1%
10000012    214      < 0.1%
10000143    31       < 0.1%
10000151    1        < 0.1%
10000186    11       < 0.1%
10000194    16       < 0.1%
10000354    5        < 0.1%
10000688    1        < 0.1%
10001242    4        < 0.1%
10001293    24       < 0.1%

Largest 10 values

Value       Count    Frequency (%)
99999999    1126     0.1%
21482686    1        < 0.1%
21474942    14       < 0.1%
21473341    8        < 0.1%
21472568    2        < 0.1%
21471872    2        < 0.1%
21471717    1        < 0.1%
21471709    12       < 0.1%
21470968    1        < 0.1%
21469641    4        < 0.1%

ONR_ID
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 83888
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.263286075 × 10^11
Minimum: 18200000
Maximum: 9.99779 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 18200000
5-th percentile: 1.0202 × 10^10
Q1: 1.276715 × 10^11
Median: 2.133913 × 10^11
Q3: 2.80479 × 10^11
95-th percentile: 5.97241 × 10^11
Maximum: 9.99779 × 10^11
Range: 9.997608 × 10^11
Interquartile range (IQR): 1.528075 × 10^11

Descriptive statistics

Standard deviation: 1.640134387 × 10^11
Coefficient of variation (CV): 0.7246694992
Kurtosis: 5.118960536
Mean: 2.263286075 × 10^11
Median Absolute Deviation (MAD): 7.868980069 × 10^10
Skewness: 1.863400769
Sum: 3.562502814 × 10^17
Variance: 2.690040806 × 10^22
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most common values

Value                   Count      Frequency (%)
1.276715 × 10^11        10922      0.7%
2.53696 × 10^11         5292       0.3%
1.05998 × 10^11         5062       0.3%
2201600033              4750       0.3%
1.523245 × 10^11        4613       0.3%
3.67998 × 10^11         4037       0.3%
7.777777778 × 10^11     3934       0.2%
5.86047 × 10^11         3656       0.2%
1.42019 × 10^10         3627       0.2%
2.24213 × 10^11         3462       0.2%
Other values (83878)    1524685    96.9%

Smallest 10 values

Value         Count    Frequency (%)
18200000      80       < 0.1%
130120000     3        < 0.1%
201600001     38       < 0.1%
1201400003    1        < 0.1%
1201400009    989      0.1%
1201500006    2        < 0.1%
1201500007    96       < 0.1%
1201500017    9        < 0.1%
1201500020    2        < 0.1%
1201600002    340      < 0.1%

Largest 10 values

Value               Count    Frequency (%)
9.99779 × 10^11     8        < 0.1%
9.9901 × 10^11      4        < 0.1%
9.9712 × 10^11      40       < 0.1%
9.96836 × 10^11     4        < 0.1%
9.9637 × 10^11      105      < 0.1%
9.96326 × 10^11     7        < 0.1%
9.9632 × 10^11      17       < 0.1%
9.9621 × 10^11      85       < 0.1%
9.96111 × 10^11     7        < 0.1%
9.95898 × 10^11     19       < 0.1%

جنسية (Nationality)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 90.1 MiB

2.0: 1502732
1.0: 71308

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 4722120
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 2.0
3rd row: 2.0
4th row: 2.0
5th row: 2.0

Common Values

Value    Count      Frequency (%)
2.0      1502732    95.5%
1.0      71308      4.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
2.01502732
95.5%
1.071308
 
4.5%

Most occurring characters

ValueCountFrequency (%)
.1574040
33.3%
01574040
33.3%
21502732
31.8%
171308
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number3148080
66.7%
Other Punctuation1574040
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01574040
50.0%
21502732
47.7%
171308
 
2.3%
Other Punctuation
ValueCountFrequency (%)
.1574040
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common4722120
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.1574040
33.3%
01574040
33.3%
21502732
31.8%
171308
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII4722120
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.1574040
33.3%
01574040
33.3%
21502732
31.8%
171308
 
1.5%

Age
Real number (ℝ)

HIGH CORRELATION

Distinct: 97
Distinct (%): < 0.1%
Missing: 500
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.54873786
Minimum: -27.6
Maximum: 143.39
Zeros: 0
Zeros (%): 0.0%
Negative: 25
Negative (%): < 0.1%
Memory size: 12.0 MiB

Quantile statistics

Minimum: -27.6
5-th percentile: 26.4
Q1: 32.4
Median: 39.4
Q3: 47.4
95-th percentile: 59.4
Maximum: 143.39
Range: 170.99
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 10.38886196
Coefficient of variation (CV): 0.2562067898
Kurtosis: 0.1404874705
Mean: 40.54873786
Median Absolute Deviation (MAD): 7
Skewness: 0.6786459787
Sum: 63805060.98
Variance: 107.9284528
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most common values

Value                Count     Frequency (%)
33.4                 65641     4.2%
34.4                 59774     3.8%
30.4                 58931     3.7%
36.4                 58435     3.7%
35.4                 58112     3.7%
37.4                 57373     3.6%
31.4                 57270     3.6%
39.4                 56847     3.6%
29.4                 56816     3.6%
38.4                 56737     3.6%
Other values (87)    987604    62.7%

Smallest 10 values

Value    Count    Frequency (%)
-27.6    7        < 0.1%
-26.6    2        < 0.1%
-25.6    4        < 0.1%
-24.6    2        < 0.1%
-23.6    1        < 0.1%
-22.6    3        < 0.1%
-21.6    1        < 0.1%
-20.6    1        < 0.1%
-18.6    1        < 0.1%
-16.6    1        < 0.1%

Largest 10 values

Value     Count    Frequency (%)
143.39    1        < 0.1%
132.39    1        < 0.1%
113.4     1        < 0.1%
99.4      3        < 0.1%
98.4      2        < 0.1%
96.4      2        < 0.1%
95.4      3        < 0.1%
94.4      4        < 0.1%
93.4      7        < 0.1%
92.4      6        < 0.1%

Age Group
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): < 0.1%
Missing: 587
Missing (%): < 0.1%
Memory size: 1.5 MiB

40-49: 585564
50-59: 440295
30-39: 248027
60-69: 222444
70-79: 65907
Other values (3): 11216

Length

Max length: 16
Median length: 5
Mean length: 5.00480599
Min length: 5

Characters and Unicode

Total characters: 7874827
Distinct characters: 20
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 60-69
2nd row: 50-59
3rd row: 50-59
4th row: 60-69
5th row: 30-39

Common Values

Value               Count     Frequency (%)
40-49               585564    37.2%
50-59               440295    28.0%
30-39               248027    15.8%
60-69               222444    14.1%
70-79               65907     4.2%
80-89               9989      0.6%
Not Defined         1187      0.1%
Not Defined20-29    40        < 0.1%
(Missing)           587       < 0.1%
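
The label "Not Defined20-29" looks like two labels concatenated by whatever rule produced this column. The report does not say how Age Group was derived, so the sketch below is only one possible way to build clean ten-year groups with `pandas.cut`: the bin edges, the half-open intervals and the "Not Defined" fallback are all assumptions made for illustration.

```python
import pandas as pd

# Sample ages, including implausible and missing values like those in the Age column.
ages = pd.Series([33.4, 47.4, 59.4, -27.6, 143.39, None])

edges = list(range(20, 100, 10))                      # 20, 30, ..., 90
labels = [f"{lo}-{lo + 9}" for lo in edges[:-1]]      # "20-29", ..., "80-89"

# right=False makes each bin half-open: [20, 30), [30, 40), ...
groups = pd.cut(ages, bins=edges, labels=labels, right=False)

# Anything outside the bins (or missing) gets a single, separate label.
groups = groups.cat.add_categories("Not Defined").fillna("Not Defined")

print(groups.tolist())
```

Keeping the fallback as its own category avoids mixed labels and makes the out-of-range rows easy to count.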

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
40-49585564
37.2%
50-59440295
28.0%
30-39248027
15.8%
60-69222444
 
14.1%
70-7965907
 
4.2%
80-899989
 
0.6%
not1227
 
0.1%
defined1187
 
0.1%
defined20-2940
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
01572266
20.0%
-1572266
20.0%
91572266
20.0%
41171128
14.9%
5880590
11.2%
3496054
 
6.3%
6444888
 
5.6%
7131814
 
1.7%
819978
 
0.3%
e2454
 
< 0.1%
Other values (10)11123
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number6289064
79.9%
Dash Punctuation1572266
 
20.0%
Lowercase Letter9816
 
0.1%
Uppercase Letter2454
 
< 0.1%
Space Separator1227
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01572266
25.0%
91572266
25.0%
41171128
18.6%
5880590
14.0%
3496054
 
7.9%
6444888
 
7.1%
7131814
 
2.1%
819978
 
0.3%
280
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
e2454
25.0%
o1227
12.5%
t1227
12.5%
f1227
12.5%
i1227
12.5%
n1227
12.5%
d1227
12.5%
Uppercase Letter
ValueCountFrequency (%)
N1227
50.0%
D1227
50.0%
Dash Punctuation
ValueCountFrequency (%)
-1572266
100.0%
Space Separator
ValueCountFrequency (%)
1227
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common7862557
99.8%
Latin12270
 
0.2%

Most frequent character per script

Common
ValueCountFrequency (%)
01572266
20.0%
-1572266
20.0%
91572266
20.0%
41171128
14.9%
5880590
11.2%
3496054
 
6.3%
6444888
 
5.7%
7131814
 
1.7%
819978
 
0.3%
1227
 
< 0.1%
Latin
ValueCountFrequency (%)
e2454
20.0%
N1227
10.0%
o1227
10.0%
t1227
10.0%
D1227
10.0%
f1227
10.0%
i1227
10.0%
n1227
10.0%
d1227
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII7874827
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
01572266
20.0%
-1572266
20.0%
91572266
20.0%
41171128
14.9%
5880590
11.2%
3496054
 
6.3%
6444888
 
5.6%
7131814
 
1.7%
819978
 
0.3%
e2454
 
< 0.1%
Other values (10)11123
 
0.1%

Interactions

(Interaction plots; images not preserved in this text version.)
2021-05-26T22:46:41.907612image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:42.940559image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:43.981307image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:44.921387image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:45.886564image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:46.951874image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:47.925307image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:48.906902image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:49.921148image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:50.924106image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:52.000425image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:52.972641image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:53.939049image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:54.937381image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:55.915898image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:57.098734image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:58.109215image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:46:59.929189image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:00.854054image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:01.885294image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:03.042091image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:04.295026image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:05.364166image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:06.449916image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:07.443181image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:08.391633image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:09.356362image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:10.368654image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:11.456777image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:12.784007image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:13.726707image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:14.801831image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:16.042894image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:17.124997image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:18.154243image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:19.466017image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:20.500385image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:21.553551image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:22.549169image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:23.555529image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:24.449092image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:25.435384image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:26.217337image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:27.178125image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:28.232304image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:29.266539image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:30.267952image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:31.223014image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:32.226643image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:33.212396image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:34.200854image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:35.337637image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:36.296173image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:37.291181image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:38.126221image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:39.177274image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:40.243095image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:41.122785image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:42.031625image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:42.814525image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:43.983766image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:44.803574image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:45.625808image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:46.460577image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:47.289360image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:48.125127image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:48.958894image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:49.804634image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:50.705223image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:51.770375image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:52.902350image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:53.905663image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:54.877595image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:55.800128image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:56.721382image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:57.612000image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:58.747808image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:47:59.776592image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:00.565483image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:01.524916image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:02.591514image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:03.811253image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:04.805978image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:05.927281image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:06.983254image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:08.151129image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:09.534432image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:10.659420image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:11.768925image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:12.687518image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:13.504407image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:14.501442image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
2021-05-26T22:48:15.646114image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
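
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the code the report itself runs; the function name `pearson_r` and the sample values are made up for the example. It also checks the invariance under changes of location and scale.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Population covariance and standard deviations; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Hypothetical sample values, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
# Shifting and (positively) rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_rescaled)
```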

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
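
As a rough sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. The helper names `average_ranks` and `spearman_rho` are hypothetical, and giving tied values the average of their ranks is an assumption (it is the usual convention, but the report does not state it).

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship: rho is 1, while Pearson's r is below 1.
x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]
print(spearman_rho(x, y), pearson_r(x, y))
```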

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
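
The pair-counting definition above corresponds to the variant usually called tau-a, which makes no adjustment for ties. A minimal sketch follows; the function name `kendall_tau_a` is hypothetical, and statistical libraries often default to a tie-adjusted variant instead, so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y order the two observations the same way.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Pairs tied in x or y count toward the total but toward neither group.
    return (concordant - discordant) / len(pairs)

# Six pairs in total: five concordant, one discordant (the swapped middle values).
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```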

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available for it.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
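
For illustration, the bias correction can be sketched from a contingency table of counts. This follows the commonly cited form of Bergsma's correction and is not necessarily line-for-line what the report's software does; the function name `cramers_v_corrected` is hypothetical, and the sketch assumes every row and column of the table has a non-zero total.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, and shrink the
    # effective numbers of rows and columns accordingly.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

perfect = cramers_v_corrected([[10, 0], [0, 10]])      # categories fully determine each other
independent = cramers_v_corrected([[5, 5], [5, 5]])    # no association at all
print(perfect, independent)
```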

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
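
One way to read the nullity correlation mentioned above is as the ordinary correlation between two present/missing indicator columns. The sketch below assumes that reading; the function name `nullity_correlation` and the sample columns are hypothetical, and missing values are represented by `None`.

```python
import math

def nullity_correlation(col_a, col_b):
    """Correlation between the present/missing indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n
    var_a = sum((p - mean_a) ** 2 for p in a) / n
    var_b = sum((q - mean_b) ** 2 for q in b) / n
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) has no nullity
        # variation, so the correlation is undefined.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical columns: each value is missing exactly where the other is present.
col_a = [1, None, 3, None]
col_b = [None, 2, None, 4]
print(nullity_correlation(col_a, col_b))  # -1: one appears only when the other is absent
```

A value near +1 means the two variables tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.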

Sample

First rows

CIVIL_ID | BIRTH_DATE | COUNTRY_CODE | COUNTRY_DESC | GENDER_CODE | GENDER_DESC | RLGION_CODE | RLGION_DESC | JOB_CODE | JOB_DESC | SECTOR | ECONOMIC_ACT_CODE | ECONOMIC_ACT_DESC | EDUCATION_CODE | EDUCATION_DESC | MAJOR_CODE | SALARY | SALARY_TYPE | ONR_GVRN_CODE | GOVERNORATE_DESC | MARITAL_STATUS_CODE | MARITAL_STATUS_DESC | COMPANY_NAME | HIRE_DATE | ADDRESS_AUTO_NO | ONR_ID | جنسية | Age | Age Group
02711212010141971-01-01101.0الكويت2.0انثىNaN43390مسئولPRIVATE61244.0التجارة العامة و المقاولاتNaNNaN200.0NaN5محافظة الفروانية2متزوجمؤسسة الرهيب الوطنية للتجارة العامة والمقاولات2020-02-1813096531.02.711212e+111.050.460-69
12760325038811976-01-01721.0باكستان1.0ذكر1.0مسلم83190حدادPRIVATE61244.0التجارة العامة و المقاولات45.0متوسط453501.080.0NaN5محافظة الفروانية2متزوجمؤسسة الرهيب الوطنية للتجارة العامة والمقاولات2008-04-1513096531.02.711212e+112.045.450-59
22761210047181976-01-01107.0مصـــر1.0ذكر1.0مسلم94985نقاشPRIVATE61244.0التجارة العامة و المقاولات45.0متوسط453501.0100.0NaN5محافظة الفروانية1أعزبمؤسسة الرهيب الوطنية للتجارة العامة والمقاولات2009-04-0613096531.02.711212e+112.045.450-59
32700508019141970-01-01107.0مصـــر1.0ذكر1.0مسلم94985نقاشPRIVATE61244.0التجارة العامة و المقاولات45.0متوسط453501.0300.0NaN5محافظة الفروانية2متزوجمؤسسة الرهيب الوطنية للتجارة العامة والمقاولات2010-02-1613096531.02.711212e+112.051.460-69
42940613039761994-01-01107.0مصـــر1.0ذكر1.0مسلم3560فنى كهربائيPRIVATE51204.0مقاولات انشاءات كهربائية وميكانيكية مثل محطات توليد الكهرباء45.0متوسط453501.0150.0NaN5محافظة الفروانية2متزوجشركة الفيالق الكويتية للتجارة العامة والمقاولات2019-06-3013094464.02.186546e+112.027.430-39
52810101208781981-01-01701.0افغانستان1.0ذكر1.0مسلم3560فنى كهربائيPRIVATE51204.0مقاولات انشاءات كهربائية وميكانيكية مثل محطات توليد الكهرباء45.0متوسط453501.0100.0NaN5محافظة الفروانية2متزوجشركة الفيالق الكويتية للتجارة العامة والمقاولات2008-09-2113094464.02.186546e+112.040.450-59
62770720091531977-01-01107.0مصـــر1.0ذكر1.0مسلم3560فنى كهربائيPRIVATE51204.0مقاولات انشاءات كهربائية وميكانيكية مثل محطات توليد الكهرباء45.0متوسط453501.0150.0NaN5محافظة الفروانية2متزوجشركة الفيالق الكويتية للتجارة العامة والمقاولات2018-12-0413094464.02.186546e+112.044.450-59
72921020042291992-01-01107.0مصـــر1.0ذكر1.0مسلم3560فنى كهربائيPRIVATE51204.0مقاولات انشاءات كهربائية وميكانيكية مثل محطات توليد الكهرباء35.0ثانوية972370.0150.0NaN5محافظة الفروانية1أعزبشركة الفيالق الكويتية للتجارة العامة والمقاولات2017-02-2713094464.02.186546e+112.029.430-39
82890623034291989-01-01709.0الهنــد1.0ذكر0.0ديانات أخري37010موزعPRIVATE51204.0مقاولات انشاءات كهربائية وميكانيكية مثل محطات توليد الكهرباء45.0متوسط453501.0120.0NaN5محافظة الفروانية2متزوجشركة الفيالق الكويتية للتجارة العامة والمقاولات2013-10-2113094464.02.186546e+112.032.440-49
92680401038531968-01-01110.0ســوريا1.0ذكر1.0مسلم37010موزعPRIVATE51204.0مقاولات انشاءات كهربائية وميكانيكية مثل محطات توليد الكهرباء45.0متوسط453501.0450.0NaN5محافظة الفروانية2متزوجشركة الفيالق الكويتية للتجارة العامة والمقاولات2009-01-1313094464.02.186546e+112.053.460-69

Last rows

CIVIL_ID | BIRTH_DATE | COUNTRY_CODE | COUNTRY_DESC | GENDER_CODE | GENDER_DESC | RLGION_CODE | RLGION_DESC | JOB_CODE | JOB_DESC | SECTOR | ECONOMIC_ACT_CODE | ECONOMIC_ACT_DESC | EDUCATION_CODE | EDUCATION_DESC | MAJOR_CODE | SALARY | SALARY_TYPE | ONR_GVRN_CODE | GOVERNORATE_DESC | MARITAL_STATUS_CODE | MARITAL_STATUS_DESC | COMPANY_NAME | HIRE_DATE | ADDRESS_AUTO_NO | ONR_ID | جنسية | Age | Age Group
15740302880810029261988-01-01107.0مصـــر1.0ذكر1.0مسلم99320عامل عادى خفيفPRIVATE83102.0شراء وبيع الاراضي والعقارات وتقسيمها45.0متوسط453501.0300.0NaN5محافظة الفروانية1أعزبشركة الجماعه العقاريه2009-07-0912468196.01.520047e+112.033.440-49
15740312900902035891990-01-01107.0مصـــر1.0ذكر1.0مسلم45290بائعPRIVATE62159.0الاسواق المركزية45.0متوسط453501.0200.0NaN1محافظة العاصمة2متزوجشركة سوق المدينة الفلسطيني المركزي2019-05-0119138157.01.120048e+112.031.440-49
15740322581115004281958-01-01702.0بنجلاديش1.0ذكر1.0مسلم45290بائعPRIVATE61172.0تجارة مستحضرات التجميل والعطورات35.0ثانوية353002.0650.0NaN1محافظة العاصمة2متزوجشركة طيب كنوز الكويت للعطور2008-12-1720994137.01.120048e+112.063.470-79
15740332750302027221975-01-01709.0الهنــد1.0ذكر3.0هندوسي33131كاشيرPRIVATE63102.0المطاعم45.0متوسط453501.0600.0NaN3محافظة الاحمدي2متزوجشركه مطعم بومباي برياني2008-11-0519438088.04.740391e+112.046.450-59
15740342720614041471972-01-01709.0الهنــد1.0ذكر3.0هندوسي53190طباخPRIVATE63102.0المطاعم45.0متوسط45.0180.0NaN3محافظة الاحمدي2متزوجشركه مطعم بومباي برياني2009-02-0319438088.04.740391e+112.049.450-59
15740352930420045971993-01-01709.0الهنــد1.0ذكر0.0ديانات أخري93990عامل انتاجPRIVATE63102.0المطاعم45.0متوسط453501.0150.0NaN3محافظة الاحمدي1أعزبشركه مطعم بومباي برياني2017-02-0719438088.04.740391e+112.028.430-39
15740362910701168881991-01-01709.0الهنــد1.0ذكر0.0ديانات أخري99410عامل مطعمPRIVATE63102.0المطاعم45.0متوسط453501.0120.0NaN3محافظة الاحمدي1أعزبشركه مطعم بومباي برياني2018-08-2919438088.04.740391e+112.030.440-49
15740372870111002781987-01-01711.0ايــران1.0ذكر1.0مسلم45290بائعPRIVATE62187.0الأثاث والمفروشات45.0متوسط453501.0450.0NaN5محافظة الفروانية2متزوجشركة الكناني استار للاثاث والمفروشات2009-09-1419669564.01.520047e+112.034.440-49
15740382990101090681999-01-01110.0ســوريا1.0ذكر1.0مسلم98515سائق مركبه خفيفهPRIVATE62187.0الأثاث والمفروشات45.0متوسط453501.0200.0NaN5محافظة الفروانية2متزوجشركة الكناني استار للاثاث والمفروشات2019-04-2419669564.01.520047e+112.022.430-39
15740392920401022011992-01-01107.0مصـــر1.0ذكر1.0مسلم19530مترجمPRIVATE71912.0مكاتب السياحة والسفر35.0ثانوية353002.0700.0NaN1محافظة العاصمة2متزوجشركة لوفيت للسياحه والسفر2013-04-2810139428.01.120048e+112.029.430-39